12 research outputs found

    Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity

    We present three enhancements to existing encoder-decoder models for open-domain conversational agents, aimed at effectively modeling coherence and promoting output diversity: (1) we introduce a measure of coherence as the GloVe embedding similarity between the dialogue context and the generated response; (2) we filter our training corpora based on this measure of coherence to obtain topically coherent and lexically diverse context-response pairs; (3) we then train a response generator using a conditional variational autoencoder model that incorporates the measure of coherence as a latent variable and uses a context gate to guarantee topical consistency with the context and promote lexical diversity. Experiments on the OpenSubtitles corpus show a substantial improvement over competitive neural models in terms of BLEU score as well as metrics of coherence and diversity.
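    As a concrete illustration of the coherence measure described above, here is a minimal sketch that scores a context-response pair by the cosine similarity of averaged GloVe word vectors. The file path, the `load_glove` helper, and plain averaging are illustrative assumptions; the paper's exact formulation may differ.

    ```python
    # Sketch of an embedding-similarity coherence measure between a dialogue
    # context and a response. Assumes GloVe vectors in the standard text format
    # ("word v1 v2 ..."); averaging word vectors is one simple choice, not
    # necessarily the paper's exact formulation.
    import numpy as np

    def load_glove(path):
        """Load GloVe vectors from a 'word v1 v2 ...' text file into a dict."""
        vectors = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                word, *values = line.rstrip().split(" ")
                vectors[word] = np.asarray(values, dtype=np.float32)
        return vectors

    def sentence_embedding(tokens, vectors):
        """Average the GloVe vectors of in-vocabulary tokens."""
        vecs = [vectors[t] for t in tokens if t in vectors]
        return np.mean(vecs, axis=0) if vecs else None

    def coherence(context_tokens, response_tokens, vectors):
        """Cosine similarity between averaged context and response embeddings."""
        c = sentence_embedding(context_tokens, vectors)
        r = sentence_embedding(response_tokens, vectors)
        if c is None or r is None:
            return 0.0
        return float(np.dot(c, r) / (np.linalg.norm(c) * np.linalg.norm(r)))

    # Example usage (hypothetical file path):
    # vectors = load_glove("glove.6B.300d.txt")
    # score = coherence("where are you going".split(),
    #                   "i am off to the market".split(), vectors)
    ```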

    Semantic consistency in text generation

    Automatic input-grounded text generation tasks process input texts and generate human-understandable natural language text for the processed information. The development of neural sequence-to-sequence (seq2seq) models, which are usually trained end-to-end, has rapidly pushed the frontier of performance on text generation tasks. However, these models have been found to be deficient in semantic consistency with respect to their corresponding input texts. The models are not solely to blame: the corpora themselves often include examples whose output is semantically inconsistent with its input, and any model that is agnostic to such data divergence issues will be prone to semantic inconsistency. Meanwhile, the most widely used overlap-based evaluation metrics, which compare generated texts to their corresponding references, do not explicitly evaluate input-output semantic consistency, which makes this problem hard to detect. In this thesis, we study semantic consistency in three automatic text generation scenarios: Data-to-text Generation, Single Document Abstractive Summarization, and Chit-chat Dialogue Generation, by seeking answers to the following research questions: (1) how to define input-output semantic consistency in different text generation tasks; (2) how to quantitatively evaluate input-output semantic consistency; (3) how to achieve better semantic consistency in individual tasks. We systematically define the semantic inconsistency phenomena in these three tasks as omission, intrinsic hallucination, and extrinsic hallucination. For Data-to-text Generation, we jointly learn a sentence planner, which tightly controls which parts of the input get generated in what order, together with a neural seq2seq text generator, to decrease all three types of semantic inconsistency in model-generated texts. The evaluation results confirm that the texts generated by our model contain far fewer omissions while maintaining a low level of extrinsic hallucinations, without sacrificing fluency compared to seq2seq models. For Single Document Abstractive Summarization, we reduce the level of extrinsic hallucination in the training data by automatically introducing assisting articles for each document-summary instance, providing the supplemental world knowledge that is present in the summary but missing from the document. With the help of a novel metric, we show that seq2seq models trained with assisting articles exhibit fewer extrinsic hallucinations than those trained without them. For Chit-chat Dialogue Generation, we diminish the level of omission and extrinsic hallucination in generated dialogue responses by filtering omitted and hallucinated examples out of the training set using a newly introduced evaluation metric, and by encoding the metric's score into the neural seq2seq response generation models as a control factor.
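    The metric-based filtering idea used for Chit-chat Dialogue Generation can be pictured with a small sketch: score each context-response pair with a consistency metric and keep only pairs inside an acceptance band. The `score_fn` interface and the thresholds below are placeholders, not the thesis's actual metric.

    ```python
    # Illustrative sketch of metric-based corpus filtering: keep only
    # context-response pairs whose consistency score falls inside a band,
    # discarding likely omissions and hallucinations. The scoring function
    # and thresholds are stand-ins for the thesis's actual metric.
    def filter_corpus(pairs, score_fn, low=0.3, high=0.9):
        """pairs: iterable of (context, response); score_fn returns a float."""
        kept = []
        for context, response in pairs:
            s = score_fn(context, response)
            if low <= s <= high:  # drop off-topic and degenerate extremes
                kept.append((context, response, s))
        return kept

    # The retained score can then be fed back into training, e.g. as a
    # control factor conditioning the response generator.
    ```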

    An Ensemble Model with Ranking for Social Dialogue

    Open-domain social dialogue is one of the long-standing goals of Artificial Intelligence. This year, the Amazon Alexa Prize challenge was announced for the first time, where real customers get to rate systems developed by leading universities worldwide. The aim of the challenge is to converse "coherently and engagingly with humans on popular topics for 20 minutes". We describe our Alexa Prize system (called 'Alana'), consisting of an ensemble of bots that combines rule-based and machine learning systems and uses a contextual ranking mechanism to choose a system response. The ranker was trained on real user feedback received during the competition; we address the problem of how to train on such noisy and sparse feedback.
    Comment: NIPS 2017 Workshop on Conversational AI
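    A minimal sketch of the ensemble-plus-ranker pattern described above: each bot proposes a candidate response, and a ranker trained on user feedback scores the candidates in context. The linear scorer and feature function are illustrative assumptions, not Alana's actual ranker.

    ```python
    # Minimal sketch of an ensemble of bots with a contextual ranker.
    # The feature function and linear scorer are illustrative assumptions;
    # the paper describes Alana's actual ranker and features.
    import numpy as np

    def rank_responses(context, bots, featurize, weights):
        """Return candidate responses sorted by ranker score, best first.

        bots:      list of callables, each mapping context -> candidate string
        featurize: callable (context, candidate) -> np.ndarray feature vector
        weights:   np.ndarray assumed learned from user feedback
        """
        candidates = [bot(context) for bot in bots]
        scored = [(float(weights @ featurize(context, c)), c) for c in candidates]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored

    # The top-scoring candidate is returned to the user:
    # best_score, best_response = rank_responses(ctx, bots, featurize, w)[0]
    ```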

    Datasets and benchmarks for task-oriented log dialogue ranking task

    AggGen: Ordering and Aggregating while Generating

    We present AggGen (pronounced 'again'), a data-to-text model which re-introduces two explicit sentence-planning stages into neural data-to-text systems: input ordering and input aggregation. In contrast to previous work using sentence planning, our model is still end-to-end: AggGen performs sentence planning at the same time as generating text, by learning latent alignments (via semantic facts) between the input representation and the target text. Experiments on the WebNLG and E2E challenge data show that by using fact-based alignments our approach is more interpretable, expressive, robust to noise, and easier to control, while retaining the advantages of end-to-end systems in terms of fluency. Our code is available at https://github.com/XinnuoXu/AggGen.
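    To make the two planning stages concrete, here is a toy, hand-coded illustration of ordering and aggregating input facts before realization, on E2E-style triples. AggGen learns these decisions as latent alignments during generation; everything below (the predicate priority, fixed group size, and templates) is purely illustrative.

    ```python
    # Toy illustration of the two sentence-planning stages AggGen makes
    # explicit: (1) ordering input facts, (2) aggregating adjacent facts into
    # sentences. AggGen learns these jointly with generation; here both steps
    # are hand-coded purely to show what a plan looks like.
    facts = [
        ("The Eagle", "eatType", "restaurant"),
        ("The Eagle", "food", "Italian"),
        ("The Eagle", "area", "city centre"),
    ]

    def order_facts(facts, priority=("eatType", "food", "area")):
        """Stage 1: order facts by a fixed predicate priority (illustrative)."""
        return sorted(facts, key=lambda f: priority.index(f[1]))

    def aggregate(facts, group_size=2):
        """Stage 2: group consecutive facts into per-sentence chunks."""
        return [facts[i:i + group_size] for i in range(0, len(facts), group_size)]

    def realize(groups):
        """Verbalize each group with a naive template (stand-in for the decoder)."""
        sentences = []
        for group in groups:
            subject = group[0][0]
            attrs = " and ".join(f"its {p} is {o}" for _, p, o in group)
            sentences.append(f"{subject}: {attrs}.")
        return " ".join(sentences)

    print(realize(aggregate(order_facts(facts))))
    ```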